Low-Cost Comparisons of File Copies
Authors
Abstract
The problem of maintaining consistency of replicas of large files has been addressed by Barbara, Feijoo and Garcia-Molina [1], Barbara and Lipton [2], Fuchs, Wu and Abraham [5], and Metzner [6]. Our model is essentially identical to that previously assumed. We present a scheme that provides all capabilities previously obtained and is additionally able to detect and identify missing as well as extraneous pages. Our solution is an adaptation of Metzner's approach. We assume the existence of a signature function for individual pages. A supersignature is obtained as a power series in a primitive root within the Galois field with 2^n elements; the coefficients are the signatures of the individual pages. This scheme enables us to detect errors such as missing, altered or incorrectly placed pages with very high probability, and to produce an error diagnosis if discrepancies have been detected.

The problem solved in this paper is typified by the following situation. Consider a large database, replicas of which are situated at several sites. Each site keeps its own log file of updates, and to ensure consistency these log files are compared periodically. The physical organization of the log file consists of pages.

Our model is the one usually adopted for this variety of problem. A file is organized as a sequence of pages, and complete copies of the file are maintained at distinct sites throughout the network. We assume that few discrepancies occur, and note that our scheme will not accommodate catastrophic failures. The cost of sending information through the network is high compared to the cost of calculations at an individual site. The order of pages is important, and there might be missing pages besides faulty ones. As usual, our solution avoids the costly and tedious task of bitwise comparison between the pages stored at different sites.

*This work was supported by the University of California MICRO program as well as the NCR Corporation, Dayton, OH.
Rather, we compare "signatures" of the individual pages by means of a supersignature calculated from the individual page signatures. This allows the conclusion that two copies are identical only with a previously chosen, arbitrarily high probability; the cost of higher accuracy is additional bits within the signature. Our scheme is based on the calculation of the supersignature. We interpret page signatures as elements of the Galois field GF(2^n) of 2^n elements, but we disallow 0 as a value for a page signature. (We later propose variants that use more common data structures.) Let g be a fixed primitive root of GF(2^n), i.e. a generator of the multiplicative group of GF(2^n). The supersignature is then calculated as the weighted sum of the page signatures with the powers of g.
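The weighted-sum idea above can be sketched in code. The paper works over GF(2^n) with unspecified parameters; the concrete choices below (n = 8, the reduction polynomial x^8 + x^4 + x^3 + x + 1, the generator g = 0x03, the function names, and the sample signature values) are illustrative assumptions for this sketch, not the paper's parameters.

```python
# Sketch of the supersignature over GF(2^8). The reduction polynomial
# 0x11B (x^8 + x^4 + x^3 + x + 1) and generator 0x03 are assumed choices.
MOD = 0x11B

def gf_mul(a, b):
    """Carry-less multiplication of two GF(2^8) elements modulo MOD."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:       # degree reached 8: reduce
            a ^= MOD
    return r

G = 0x03  # a primitive root, i.e. a generator of GF(2^8)'s multiplicative group

def supersignature(page_sigs):
    """S = sig_1*g + sig_2*g^2 + ... ; page signatures must be nonzero."""
    s, w = 0, G
    for sig in page_sigs:
        assert 1 <= sig <= 0xFF, "0 is disallowed as a page signature"
        s ^= gf_mul(sig, w)  # addition in GF(2^n) is bitwise XOR
        w = gf_mul(w, G)     # next weight: the next power of g
    return s

sigs = [0x5A, 0x17, 0xC3, 0x2E]          # hypothetical page signatures
s1 = supersignature(sigs)
s2 = supersignature([0x17, 0x5A, 0xC3, 0x2E])  # first two pages swapped
s3 = supersignature([0x5A, 0xC3, 0x2E])        # second page missing
print(s1 != s2, s1 != s3)  # → True True
```

Because each page signature is weighted by a distinct power of g, swapping two unequal pages changes the sum by (sig_i XOR sig_j)·(g^i XOR g^j), which is nonzero, so misplaced pages are caught, not just altered ones.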
Similar Resources
An Analysis of Security Vulnerabilities in the Movie Production and Distribution Process
Unauthorized copying of movies is a major concern for the motion picture industry. While unauthorized copies of movies have been distributed via portable physical media for some time, low-cost, high-bandwidth Internet connections and peer-to-peer file sharing networks provide highly efficient distribution media. Many movies are showing up on file sharing networks shortly after, and in some case...
Is The Answer to the Machine Really in the Machine? Technical copyright protection and file-sharing communities
E-commerce and content distribution models rely on working copyright protection. Copyright is a part of the necessary trust infrastructure of e-commerce for immaterial goods. The recently introduced legal protection of technical measures is theoretically interesting in that it uses legislation to protect a certain form of architecture. This seemingly double protection is also highly paradoxical,...
Minimum cost mirror sites using network coding: Replication vs. coding at the source nodes
Content distribution over networks is often achieved by using mirror sites that hold copies of files or portions thereof to avoid congestion and delay issues arising from excessive demands to a single location. Accordingly, there are distributed storage solutions that divide the file into pieces and place copies of the pieces (replication) or coded versions of the pieces (coding) at multiple so...
Achieving Strong Consistency in a Distributed File System
Distributed file systems nowadays need to provide for fault tolerance. This is typically achieved with the replication of files. Existing approaches to the construction of replicated file systems sacrifice strong semantics (i.e., the guarantees the systems make to running computations when failures occur and/or files are accessed concurrently). This is done mainly for efficiency reasons. This p...
On the Optimality of Forest-Type File Transfers on a File Transmission Net
A problem of obtaining an optimal file transfer on a file transmission net N is to consider how to transmit, with a minimum total cost, copies of a certain file of information from some vertices to others on N by the respective vertices’ copy demand numbers. This problem is NP-hard for a general file transmission net. So far, some class of N on which polynomial time algorithms for obtaining an ...
Publication date: 1990